On the Distribution of Control in Asynchronous Processor Architectures
Institute for Computing Systems Architecture
The effective performance of computer systems is to a large measure
determined by the synergy between the processor architecture, the
instruction set and the compiler. In the past, the sequencing of
information within processor architectures has normally been
synchronous: controlled centrally by a clock. However, this global
signal could limit the future gains in performance that can be
achieved through improvements in implementation technology.
This thesis investigates the effects of relaxing this strict synchrony
by distributing control within processor architectures through the use
of a novel asynchronous design model known as a micronet. The impact
of asynchronous control on the performance of a RISC-style processor
is explored at different levels. Firstly, improvements in the
performance of individual instructions by exploiting actual run-time
behaviours are demonstrated. Secondly, it is shown that micronets are
able to exploit further (both spatial and temporal) instruction-level
parallelism (ILP) efficiently through the distribution of control to
datapath resources. Finally, exposing fine-grain concurrency within a
datapath can only be of benefit to a computer system if it can easily
be exploited by the compiler. Although compilers for micronet-based
asynchronous processors may be considered to be more complex than
their synchronous counterparts, it is shown that the variable
execution time of an instruction does not adversely affect the
compiler's ability to schedule code efficiently. In conclusion, the
modelling of a processor's datapath as a micronet permits the
exploitation of both fine-grain ILP and actual run-time delays, thus
leading to the efficient utilisation of functional units and in turn
resulting in an improvement in overall system performance.
Improving Memory Hierarchy Utilisation for Stencil Computations on Multicore Machines
Although modern supercomputers are composed of multicore machines, some
scientists still run legacy applications that were developed for single-core
clusters, in which the memory hierarchy is dedicated to a single core. The main
objective of this paper is to propose and evaluate an algorithm that identifies
an efficient blocksize to be applied to MPI stencil computations on multicore
machines. In the light of an extensive experimental analysis, this work shows
the benefits of identifying blocksizes that divide the data among the various
cores, and suggests a methodology that explores the memory hierarchy available
in modern machines.
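To make the idea of a blocksize concrete, the following is a minimal sketch of a cache-blocked 5-point stencil sweep. It is not the algorithm proposed in the paper: the function names, the square-tile shape, and the rule in `block_for_cache` (fit the tile into one core's share of a shared cache) are all illustrative assumptions.

```python
import math


def jacobi_step(grid):
    """One unblocked 5-point Jacobi sweep over the interior of a 2D list."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def jacobi_step_blocked(grid, block):
    """Same sweep, but visiting the interior tile by tile (block x block)."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for bi in range(1, n - 1, block):          # tile origin, rows
        for bj in range(1, m - 1, block):      # tile origin, columns
            for i in range(bi, min(bi + block, n - 1)):
                for j in range(bj, min(bj + block, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out


def block_for_cache(cache_bytes, cores_sharing, elem_bytes=8, arrays=2):
    """Hypothetical heuristic: largest square tile edge whose working set
    (input and output arrays) fits in one core's share of a shared cache."""
    share = cache_bytes // cores_sharing
    elems = share // (elem_bytes * arrays)
    return max(1, math.isqrt(elems))


# Small deterministic test grid.
grid = [[float((3 * i + 5 * j) % 7) for j in range(9)] for i in range(8)]
same = jacobi_step_blocked(grid, 3) == jacobi_step(grid)
```

Blocking changes only the order in which points are visited, so the blocked and unblocked sweeps produce the same values; any benefit comes from reuse of cached neighbours. When several cores share a cache level, each core's usable share shrinks, which is why a blocksize tuned for a single-core node may no longer fit.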
OMWS: A Web Service Interface for Ecological Niche Modelling
Ecological niche modelling (ENM) experiments often involve a high number of tasks to be performed. Such tasks may consume a significant amount of computing resources and take a long time to complete, especially when using personal computers. OMWS is a Web service interface that allows more powerful computing back-ends to be remotely exploited by other applications to carry out ENM tasks. Its latest version includes a new operation that can be used to specify complex workflows in a single request, adding the possibility of using workflow management systems on parallel computing back-ends. In this paper we describe the OMWS protocol and compare its most recent version with the previous one by running the same ENM experiment using two functionally equivalent clients, each designed for one of the OMWS interface versions. Different back-end configurations were used to investigate how the performance scales for each protocol version when more processing power is made available. Results show that the new version outperforms the previous one by a factor of 2 when more computing resources are used.
The latest version of OMWS contains improvements arising from different sets of requirements from two projects that funded the corresponding implementations: EUBrazilOpenBio14, with grants from the European Commission and the National Council for Scientific and Technological Development of Brazil (CNPq) of the Brazilian Ministry of Science and Technology (MCT), and BioVeL, with grants from the European Commission. Server infrastructure was operated through a provisioning system developed in the frame of the Spanish project CLUVIEM (TIN2013-44390-R) funded by the "Ministerio de Economía y Competitividad".
Giovanni, RD.; Torres Serrano, E.; Amaral, RB.; Blanquer Espert, I.; Rebello, V.; Canhos, VP. (2015). OMWS: A Web Service Interface for Ecological Niche Modelling. Biodiversity Informatics. 10:35-44. https://doi.org/10.17161/bi.v10i0.4853
TOWARDS OPTIMAL STATIC TASK SCHEDULING FOR REALISTIC MACHINE MODELS: THEORY AND PRACTICE
Task scheduling is a key element in achieving high performance from multicomputer systems. Efficient scheduling algorithms reduce the interprocessor communication and improve processor utilization. To do so effectively, such algorithms must be based on a communication cost model appropriate for computing systems in use. The optimal scheduling of tasks is NP-hard, and a large number of heuristic algorithms have been proposed for a range of differing scheduling conditions (graph types, granularities and cost or architectural models). Unfortunately, due both to the variety of systems available and the rate at which these systems evolve, an appropriate representative cost model has yet to be established. In this paper we study the problem of task scheduling unde
Towards an Effective Task Clustering Heuristic for LogP Machines
This paper describes a task scheduling algorithm, based on a LogP-type model, for allocating arbitrary task graphs to fully connected networks of processors. This problem is known to be NP-complete even under the delay model (a special case of the LogP model). The strategy exploits the replication and clustering of tasks to minimise the ill effects of communication overhead on the makespan. The quality of the schedules produced by this LogP-based algorithm, initially under delay model conditions, is compared with that of other good delay model-based approaches.
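As an illustration of why clustering matters under the delay model, here is a minimal sketch that evaluates the makespan of a given task-to-processor assignment. It is not the heuristic described in the paper and does not model task replication or the LogP overhead and gap parameters. The function `makespan`, its arguments, and the example graph are assumptions made for this sketch: a fixed delay is charged on every edge that crosses processors and none on edges within a cluster.

```python
def makespan(tasks, edges, cluster, delay):
    """Makespan of a clustered schedule under a simple delay model.

    tasks:   dict mapping task name -> execution time, listed in a
             topological order of the task graph (dict insertion order)
    edges:   list of (predecessor, successor) pairs
    cluster: dict mapping task name -> processor id
    delay:   communication delay charged when an edge crosses processors
    """
    preds = {t: [] for t in tasks}
    for u, v in edges:
        preds[v].append(u)

    finish = {}      # finish time of each scheduled task
    proc_free = {}   # time at which each processor becomes idle
    for t, cost in tasks.items():
        ready = 0
        for p in preds[t]:
            comm = 0 if cluster[p] == cluster[t] else delay
            ready = max(ready, finish[p] + comm)
        start = max(ready, proc_free.get(cluster[t], 0))
        finish[t] = start + cost
        proc_free[cluster[t]] = finish[t]
    return max(finish.values())


# Hypothetical fork-join graph: a -> {b, c} -> d
tasks = {"a": 1, "b": 2, "c": 2, "d": 1}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

one_cluster = {"a": 0, "b": 0, "c": 0, "d": 0}
two_clusters = {"a": 0, "b": 0, "c": 1, "d": 0}

serial = makespan(tasks, edges, one_cluster, delay=3)        # 6
split_slow = makespan(tasks, edges, two_clusters, delay=3)   # 10
split_fast = makespan(tasks, edges, two_clusters, delay=0)   # 4
```

In this example, running `b` and `c` in parallel pays off only when the delay is small; with a delay of 3 the single-cluster schedule is shorter. Clustering heuristics exist to navigate this trade-off between parallelism and communication.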
using the UpRight Library
Firstly, I thank God, who works in ways beyond our understanding, but makes all things possible. I am grateful for the blessings I have been given in life: curiosity, skill and faith. Curiosity, to never stop asking questions and seeking answers to them; skill, to solve the problems that I can solve on my own; faith, to deal with those that I can’t. I am grateful for the blessing of a wonderful family, to which this thesis is dedicated. My parents, Santosh and Corinne, have always been supportive of the choices I have made in my life. When I was unsure whether pursuing a Master’s degree halfway across the world from home was worth the cost and effort, they reassured me that it was. They were right. Without their love and support, I would not be where I am today. My sister, Sonia, and her husband, Vinod, deserve their share of credit: their support, advice and reassurance gave me the motivation I needed in the last few months of pushing hard to get this work done. I thank Lorenzo Alvisi, my advisor, for the opportunity to work o